An explosive growth of data science in the last few years enriched the world with ready to use tools for free (especially, R and python). Fintech is bringing a new way the finance industry operates, as well as changing the finance itself. However, some well-established spheres keep 'status quo' untill nowadays. Thus, financial and economic modeling, particularly in oil & gas, mostly relies on commercial-off-the-shelf software and MS Excel spreadsheets, the later is the industry standard de- facto. This article is to show an alternative, data-science based approach, with proper python scripts and concluding analysis of the pros and cons of this approach.
Financial modeling is a discipline that requires domain knowledge in investment and accountancy, contracting laws, and other finance-related skills. To meet modern trends, that requirements should supplement programming skills as well.
As of today, the complex project financial models mostly are developed by commercial software, or by flexible spreadsheets in Microsoft Excel. While both ways are beneficial, they are followed by side-effects. The first one is often a ‘black-box’ software with little chances to deep-dive into exact formulas, calculus precision, etc. Moreover, such solutions often require cloud-based storage of the financial modeling results, and that could be a barrier for a reason of data sensitivity. Another choice - MS Excel - is prone to some issues with increase of modeling complexity: too many formulas with inside- and outside- sheet’s links are overwhelming to review, and such option as Monte Carlo simulation needs to a quite complex arrangement to apply within MS Excel (not to mention cycling formula’s issues, large-size files failures; also, manipulation with complex models may lead to inconsistencies and modeler’s headache).
It needs to point out that such features as transparency, ease of to review a model (i.e. developed by contractors) and ease to test the model are probably even more important. Besides, the ability to co-integrate a new model with a corporate modeling environment (i.e. to run it under changing external factors) is multiplying its value.
The fast evolving field of data science and fintech leads to the development and maturing of efficient open-source modeling ecosystems. Thus, R and Python today provides a great opportunity for financial modeling with the stable and concise programming language, a wide range of supplementary packages, implementation tools aka ‘jupyter notebook’, and a lot of learning and Q&A sources easily available online for free.
Nowadays, python is a standard modeling tool in investment banks and hedge funds, it’s a tool for ABS cash flow assessment by SEC. There are some papers regarding spreadsheet substitution by python, although there is room for improvement in the field.
To tackle the problem mentioned above, a practical approach for project financial modeling with python is presented below. A case of upstream production shared agreement financial modeling is discussed to fill the gap in oil&gas industry.
The rest of the paper is organized as follows: firstly introductory description of modeling approach and tools will be discussed, secondly, the step-by-step scripting approach to developing a model will be presented. Thirdly, the risk assessment approach will be under consideration in Part 2. Concluding remarks provide some inference on this research.
The case under consideration is about an independent oil & gas company (IOC) which is looking for a decision either to enter the project and sign a proper PSA with the State. This particular upstream project is determined by the production sharing agreement (PSA) environment, presented in a book (A. Inkpen and M.H. Moffett, The Global Oil&Gas Industry Management, Strategy & Finance. 2012). All the following input data is based on the case from the book, reconciling the results.
To come up with the answer, the following steps will be proceeded:
For the implementation of this project the following python packages were used:
Due to the simplicity of this example, all the data inputs are inserted directly to the Jupyter Notebook and no reading of external sources is required. Although it is worth to mention, that ease of dealing with external data is a strong capability of python.
# import necessary python packages
import numpy as np
import pandas as pd
#import scipy as sc
from scipy.stats import truncnorm, triang
from scipy.optimize import minimize
from multiprocessing import Pool
import matplotlib.pyplot as plt
%matplotlib inline
import holoviews as hv
from holoviews.util.settings import OutputSettings
hv.extension('bokeh')
# Set Production Profile (values in thousands of barrels) for each year
df = pd.DataFrame([0, 0, 578, 6100, 9420, 12400, 10850, 9494, 8307,
7269, 6360, 5565, 4869, 4261, 3728, 3262, 2854,
2498, 2185], # *1000 bbl
index=[*range(1,20)], # Set the years numbering from 1 to 19
columns=['Production (1,000 bbl)']) # Set the name of the column
# Calculate cumulative production
df['Cumulative Production (1,000 bbl)'] = df['Production (1,000 bbl)'].cumsum()
# Plot production and cumulative production bar chart
ax=df.plot(kind='bar', secondary_y=['Cumulative Production (1,000 bbl)'],
legend='reverse', figsize=(12,7), alpha=0.8,
title='Yearly Production Profile', table=True)
# Some finer adjustment for the plot
ax.get_xaxis().set_visible(False)
tb = ax.tables[0]
tb.auto_set_font_size(False)
tb.set_fontsize(8)
# We will remove cumulative production from the dataframe
df.drop('Cumulative Production (1,000 bbl)', axis=1, inplace=True)
# Set a price USD per bbl. for each year
df['Oil Price ($/bbl)'] = 20. # $/bbl
# Calculate Gross Revenue, in thousand USD
df['Gross Revenue ($m)'] = df[['Production (1,000 bbl)',
'Oil Price ($/bbl)']].prod(axis=1) # $m
# Calculate Royalty and net Revenue
royalty_rate = 10/100 # 10%
df['Royalty ($m)'] = df['Gross Revenue ($m)'] * royalty_rate
df['Net Revenue ($m)'] = df['Gross Revenue ($m)'] - df['Royalty ($m)']
df.index.name = 'Year'
# Print the resulting dataframe
df.T.style.format("{:,.0f}")
# Capital Costs are given in a referenced book
df['Capital Costs ($m)'] = [30000, 40000, 100000, 60000, 70000] + [0]*14
# Find first of operation year
first_year_of_operation = df.index[df['Production (1,000 bbl)']>0][0]
# 5 years straight line depreciation
years_sl = 5
# Calculate depreciation, starting no earlier than 1st year of operation
depr_schedule = pd.DataFrame(0., index=df.index,
columns=df.index[df['Capital Costs ($m)']>0])
# Assign yearly depreciation values
depr_schedule.loc[1:years_sl,:] = df['Capital Costs ($m)'].\
where(df['Capital Costs ($m)']>0) / years_sl
# Shift depreciation to proper years
depr_schedule = depr_schedule.apply(lambda x:
x.shift((max(x.name,
first_year_of_operation)
- 1), fill_value=0), axis=0)
# Now print depreciation schedule for the Capex from each year
depr_schedule.T
# Finally, sum all depreciation schedule into net Depreciation
df['Depreciation ($m)'] = depr_schedule.sum(axis=1)
# And print the resulting DataFrame
df[['Capital Costs ($m)', 'Depreciation ($m)']].T.style.format("{:,.0f}")
# Operating costs per bbl of production from the referenced book
df['Operating Cost ($/1000 bbl)'] = [0, 0, 5.5, 2.6, 2.4, 2.3, 2.36, 2.4, 2.46,
2.54, 2.64, 2.72, 2.82, 2.94, 3.08, 3.24,
3.4, 3.6, 3.4] # $/1000 bbl
# Calculate Operating Expence
df['Operating Expense ($m)'] = df[['Production (1,000 bbl)',
'Operating Cost ($/1000 bbl)']].prod(axis=1)
# Get Total Expences
df['Total Expenses ($m)'] = df[['Operating Expense ($m)',
'Depreciation ($m)']].sum(axis=1)
df[['Operating Cost ($/1000 bbl)', 'Operating Expense ($m)',
'Total Expenses ($m)']].T.style.format("{:,.2f}")
# Cost recovery is limited by 50% of Gross Revenue
cost_recovery_limit = 50/100 # 50%
df['Cost Recovery Limit ($m)'] = df['Gross Revenue ($m)'] * cost_recovery_limit
# Cost Recovery Carry Forward
df['C/R C/F ($m)'] = (df['Total Expenses ($m)'] -
df['Cost Recovery Limit ($m)']).cumsum().clip(lower=0)
df['Cost Recovery ($m)'] = np.minimum(df['Total Expenses ($m)'] +
df['C/R C/F ($m)'].shift(1, fill_value=0),
df['Cost Recovery Limit ($m)'])
df[['Cost Recovery Limit ($m)', 'C/R C/F ($m)',
'Cost Recovery ($m)']].T.style.format("{:,.0f}")
# Calculate Profit Oil
df['Total Profit Oil ($m)'] = df['Net Revenue ($m)'] - df['Cost Recovery ($m)']
# Calculate split of profit oil between IOC and State
profit_oil_split = 0.4 # 60% going to State and 40% to IOC
df['State Profit Oil Share ($m)'] = df['Total Profit Oil ($m)'] * \
(1 - profit_oil_split)
df['IOC Profit Oil Share ($m)'] = df['Total Profit Oil ($m)'] * \
profit_oil_split
df[['Total Profit Oil ($m)', 'State Profit Oil Share ($m)',
'IOC Profit Oil Share ($m)']].T.style.format("{:,.0f}")
# Signature bonus in case a PSA is signed
df['Signature Bonus ($m)'] = [1e4] + [0.]*18 # M$10
# Calculate operating income
operating_income = (df[['Cost Recovery ($m)',
'IOC Profit Oil Share ($m)']].sum(axis=1) -\
df[['Total Expenses ($m)',
'Signature Bonus ($m)']].sum(axis=1))
# Create function and calculate loss carry forward (no term limits here)
loss_carry_forward = np.frompyfunc(lambda a,b: a+b if a < 0 else b, 2, 1)
df['Taxable Income ($m)'] = loss_carry_forward.accumulate(
operating_income.values, dtype=np.object)
# Calculate Income Tax of IOC
income_tax = .3 # 30%
df['Income Tax ($m)'] = df['Taxable Income ($m)'].multiply(income_tax)\
.clip(lower=0)
df[['Signature Bonus ($m)', 'Taxable Income ($m)',
'Income Tax ($m)']].T.style.format("{:,.0f}")
df['IOC Net Cash Flow ($m)'] = (df[['Cost Recovery ($m)',
'IOC Profit Oil Share ($m)']].sum(axis=1) -\
df[['Capital Costs ($m)',
'Operating Expense ($m)',
'Signature Bonus ($m)',
'Income Tax ($m)']].sum(axis=1))
discount_rate = 12 / 100 # discount rate = 12 % (half-year)
df['IOC DCF @ 12% (half-year)'] = df['IOC Net Cash Flow ($m)'] / \
(1 + discount_rate)**(df.index - 0.5)
df['State Net Cash Flow ($m)'] = df[['Signature Bonus ($m)', 'Royalty ($m)',
'State Profit Oil Share ($m)',
'Income Tax ($m)']].sum(axis=1)
df['State DCF @ 12% (half-year)'] = df['State Net Cash Flow ($m)'] / \
(1 + discount_rate)**(df.index - 0.5)
# Highlight of State Cash Flow Sources (uncomment to view)
"""df[['Signature Bonus ($m)', 'Royalty ($m)','State Profit Oil Share ($m)',
'Income Tax ($m)','State Net Cash Flow ($m)',
'State DCF @ 12% (half-year)']].T.style.format("{:,.0f}")"""
df.T.style.applymap(lambda x: 'color: red' if x<0 else
'color: black').format("{:,.0f}")
Thus, we got the same spreadsheet as presented in the referenced book. Besides, let’s calculate NPV and IRR for both the IOC and State. Finally, we will plot net cash flows.
# NPV and IRR for IOC
npv_ioc = df['IOC DCF @ 12% (half-year)'].sum()
irr_ioc = np.irr(df['IOC Net Cash Flow ($m)'])
print('IOC NPV: \t${:,.0f} \tIRR: \t{:.1f}%'.format(npv_ioc*1e3, irr_ioc*100))
# NPV and IRR for State
npv_state = df['State DCF @ 12% (half-year)'].sum()
irr_state = np.irr(df['State Net Cash Flow ($m)'])
irr_state = irr_state if not np.isnan(irr_state) else 0
print('State NPV: \t${:,.0f} \tIRR: \t{:.1f}%'.format(npv_state*1e3,
irr_state*100))
# Sum of net cash flows for State and IOC
series = df[['State Net Cash Flow ($m)', 'IOC Net Cash Flow ($m)']].sum()
series.index = ['State', 'IOC']
series.name = 'Cumulative Net CF (%)'
series.plot.pie(autopct='%.2f', fontsize=14, figsize=(4, 4));
# Computing cumulative cash flow
cum = df[['State Net Cash Flow ($m)',
'IOC Net Cash Flow ($m)']].cumsum().rename(
columns={'State Net Cash Flow ($m)':'Cumulative State CF ($m)',
'IOC Net Cash Flow ($m)':'Cumulative IOC CF ($m)'})
# Plotting the Cash Flows
ax1 = df[['State Net Cash Flow ($m)',
'IOC Net Cash Flow ($m)']].reset_index(drop=True).plot(kind='bar',
grid=True)
ax2 = ax1.twinx()
cum.reset_index(drop=True).plot(kind='line', ax=ax2, grid=True, alpha=1,
legend=True, figsize=(15,10),
title='IOC and State Cash Flow Projection');
ax1.set_xticklabels(df.index, rotation=0);
In Part 1 we built a financial model for production sharing agreement with python scripts. The next part will show the way for the estimation of risk.
To deal with uncertain world, the modeller should assess the influence of external and internal project factors to investment aggregates like NPV and IRR. To solve this task we will do the following:
Below are the python implementation and scripts with comments.
class PSAFinModel(object):
"""An object for calculating and modeling of financial data
and cash flow of production sharing agreement
"""
def __init__(self, prod_cap_op_cost, parameters_dict):
"""Input data:
"""
self.prod_cap_op_cost = prod_cap_op_cost
self.p = parameters_dict
# cash-flow projection DataFrame
self.df = pd.DataFrame(np.zeros((19,23)), index=[*range(1,20)])
self.df.columns = [
'Production (1,000 bbl)', 'Oil Price ($/bbl)', 'Gross Revenue ($m)',
'Royalty ($m)', 'Net Revenue ($m)', 'Capital Costs ($m)',
'Depreciation ($m)', 'Operating Cost ($/1000 bbl)',
'Operating Expense ($m)', 'Total Expenses ($m)',
'Cost Recovery Limit ($m)', 'C/R C/F ($m)', 'Cost Recovery ($m)',
'Total Profit Oil ($m)', 'State Profit Oil Share ($m)',
'IOC Profit Oil Share ($m)', 'Signature Bonus ($m)',
'Taxable Income ($m)', 'Income Tax ($m)', 'IOC Net Cash Flow ($m)',
'IOC DCF @ 12% (half-year)', 'State Net Cash Flow ($m)',
'State DCF @ 12% (half-year)']
self.df.index.name = 'Year'
# project kpi DataFrame
self.kpi = pd.DataFrame(np.zeros((6,3)),
columns=['IOC', 'State', 'Project'],
index=['CCF ($m)', 'NPV ($m)', 'IRR (%)',
'PP (y)', 'DPP (y)', 'PI'])
def get_ncf(self):
"""Step by step calculation of net cash flow of IOC and State
"""
# Set input data
self.df[['Production (1,000 bbl)', 'Capital Costs ($m)',
'Operating Cost ($/1000 bbl)']] = self.prod_cap_op_cost.values
# Revenue
# Set a price USD per bbl. for each year
self.df['Oil Price ($/bbl)'] = self.p['Oil Price ($/bbl)']
# Calculate Gross Revenue, in thousand USD
self.df['Gross Revenue ($m)'] = self.df[['Production (1,000 bbl)',
'Oil Price ($/bbl)']].\
prod(axis=1) # $m
# Calculate Royalty and net Revenue
self.df['Royalty ($m)'] = self.df['Gross Revenue ($m)'].\
multiply(self.p['Royalty Rate'])
self.df['Net Revenue ($m)'] = self.df['Gross Revenue ($m)'] - \
self.df['Royalty ($m)']
# Capital Costs and Depreciation
# Find first of operation year (lets play algo)
prod_gt0 = (self.df['Production (1,000 bbl)'] > 0)
first_year_of_operation = self.df.index[prod_gt0][0]
# Calculate depreciation, starting no earlier than 1st year of operation
depr_schedule = pd.DataFrame(0., index=self.df.index,
columns=self.df.index[self.df['Capital Costs ($m)']>0])
# Assign yearly depreciation values
depr_schedule.loc[1:self.p['Depreciation Term'],:] = \
self.df['Capital Costs ($m)'].\
where(self.df['Capital Costs ($m)']>0) / \
self.p['Depreciation Term']
# Shift depreciation to proper years
depr_schedule = depr_schedule.apply(lambda x:
x.shift(max(first_year_of_operation,
x.name) - 1,
fill_value=0), axis=0)
# Finally, sum all depreciation schedule into net Depreciation
self.df['Depreciation ($m)'] = depr_schedule.sum(axis=1)
# Operating Costs
# Calculate Operating Expense
self.df['Operating Expense ($m)'] = self.df[['Production (1,000 bbl)',
'Operating Cost ($/1000 bbl)']].prod(axis=1)
# Total Expense
self.df['Total Expenses ($m)'] = self.df[['Operating Expense ($m)',
'Depreciation ($m)']].sum(axis=1)
# Cost Recovery
self.df['Cost Recovery Limit ($m)'] = self.df['Gross Revenue ($m)'] * \
self.p['Cost Recovery Limit']
# Cost Recovery Carry Forward
self.df['C/R C/F ($m)'] = (self.df['Total Expenses ($m)'] -
self.df['Cost Recovery Limit ($m)']).\
cumsum().clip(lower=0)
self.df['Cost Recovery ($m)'] = np.minimum(
self.df['Total Expenses ($m)'] +
self.df['C/R C/F ($m)'].\
shift(1, fill_value=0),
self.df['Cost Recovery Limit ($m)'])
# Profit Oil
# Calculate Profit Oil
self.df['Total Profit Oil ($m)'] = self.df['Net Revenue ($m)'] - \
self.df['Cost Recovery ($m)']
# Calculate split of profit oil between IOC and State
self.df['State Profit Oil Share ($m)'] = \
self.df['Total Profit Oil ($m)'] * \
(1 - self.p['IOC Profit Oil Split'])
self.df['IOC Profit Oil Share ($m)'] = \
self.df['Total Profit Oil ($m)'] * \
self.p['IOC Profit Oil Split']
# Bonuses and Taxes of IOC
# Signature bonus in case a PSA is signed
self.df.at[1,'Signature Bonus ($m)'] = self.p['Signature Bonus']
# Calculate operating income
operating_income = (self.df[['Cost Recovery ($m)',
'IOC Profit Oil Share ($m)']]\
.sum(axis=1) -\
self.df[['Total Expenses ($m)',
'Signature Bonus ($m)']].sum(axis=1))
# Create function and calculate loss carry forward (no term limits)
loss_carry_forward = np.frompyfunc(lambda a,b: a+b if a < 0
else b, 2, 1)
self.df['Taxable Income ($m)'] = loss_carry_forward.accumulate(
operating_income.values, dtype=np.object)
# Calculate Income Tax of IOC
self.df['Income Tax ($m)'] = self.df['Taxable Income ($m)']\
.multiply(self.p['Income Tax']).clip(lower=0)
# IOC Free Cash Flow
self.df['IOC Net Cash Flow ($m)'] = (self.df[['Cost Recovery ($m)',
'IOC Profit Oil Share ($m)']].sum(axis=1) -\
self.df[['Capital Costs ($m)',
'Operating Expense ($m)',
'Signature Bonus ($m)',
'Income Tax ($m)']]\
.sum(axis=1))
self.df['IOC DCF @ 12% (half-year)'] = \
self.df['IOC Net Cash Flow ($m)'] / \
(1 + self.p['Discount Rate'])**(self.df.index - 0.5)
# State Free Cash Flow
self.df['State Net Cash Flow ($m)'] = \
self.df[['Signature Bonus ($m)', 'Royalty ($m)',
'State Profit Oil Share ($m)','Income Tax ($m)']].sum(axis=1)
self.df['State DCF @ 12% (half-year)'] = \
self.df['State Net Cash Flow ($m)'] / \
(1 + self.p['Discount Rate'])**(self.df.index - 0.5)
return self.df[['IOC Net Cash Flow ($m)',
'IOC DCF @ 12% (half-year)',
'State Net Cash Flow ($m)',
'State DCF @ 12% (half-year)']]
def get_kpi(self):
"""Module to calculate investment project indicators:
Cumulative Cash Flow, Net Present Value, Internal Rate of Return,
Payback Period, Discounted Payback Period, Profitability Index
for investment decision making.
"""
def payback_period(ts):
"""Function to calculate PP with for input time-series
"""
if not (ts[ts.cumsum() > 0].empty | ts[ts.cumsum() < 0].empty):
final_full_year = ts[ts.cumsum() < 0].index.values.max()
fractional_yr = - ts.cumsum()[final_full_year] / \
ts[final_full_year + 1]
pp = (final_full_year + fractional_yr)
pp = round(pp, 1)
elif ts[ts.cumsum() < 0].empty:
pp = 0
else:
pp = np.nan
return pp
profitability_index = lambda ts:round(((ts>0)*(1-ts.cumsum()
/ ts.cumsum().min())).max(), 1) if ts.min() < 0 else np.inf
irr = lambda dcf: round(np.irr(dcf)*100, 1) \
if not np.isnan(np.irr(dcf)) else 0
# a loop to get kpi values
for i, (cf, dcf) in enumerate([(self.df['IOC Net Cash Flow ($m)'],
self.df['IOC DCF @ 12% (half-year)']),
(self.df['State Net Cash Flow ($m)'],
self.df['State DCF @ 12% (half-year)']),
(self.df[['IOC Net Cash Flow ($m)',
'State Net Cash Flow ($m)']].sum(axis=1),
self.df[['IOC DCF @ 12% (half-year)',
'State DCF @ 12% (half-year)']].sum(axis=1))
]):
self.kpi.iloc[:,i] = [round(cf.sum(), 1),
round(dcf.sum(), 1),
irr(cf),
payback_period(cf),
payback_period(dcf),
profitability_index(dcf)]
self.kpi.index.name = 'Parameter'
return self.kpi
# Set input parameters
# Dictionary with major inputs
psa_parameters_dict = {
'Oil Price ($/bbl)': 20, # Oil price $20/bbl for the model estimation
'Royalty Rate': 0.1, # Royalty rate 10%
'Cost Recovery Limit': 0.5, # Up to 50% of Gross Revenue
'IOC Profit Oil Split': 0.4, # IOC 40%/ State 60%
'Income Tax': 0.3, # CIT 30%
'Signature Bonus': 1e7/1e3, # Signature Bonus $10M
'Depreciation Method': 'SL', # Straight Line depreciation of capital assets
'Depreciation Term': 5, # 5 years
'Discount Rate': .12 # 12% discount rate
}
# DataFrame with input time-series
input_estimates = pd.DataFrame(
{'Production (1,000 bbl)': [0, 0, 578, 6100, 9420, 12400, 10850, 9494, 8307,
7269,6360, 5565, 4869, 4261, 3728, 3262, 2854,
2498, 2185],
'Capital Costs ($m)': [30000, 40000, 100000, 60000, 70000] + [0]*14,
'Operating Cost ($/1000 bbl)': [0, 0, 5.5, 2.6, 2.4, 2.3, 2.36, 2.4, 2.46,
2.54, 2.64, 2.72, 2.82, 2.94, 3.08, 3.24,
3.4, 3.6, 3.4]
},
index=[*range(1,20)]
)
After the above manipulation it takes just a few strings of code to calculate prognosis cash flow.
# initiate object for calculations
psa = PSAFinModel(prod_cap_op_cost=input_estimates,
parameters_dict=psa_parameters_dict)
# calculate CF and DCF with get_ncf method
psa.get_ncf();
# calculate project kpi's
kpi = psa.get_kpi()
kpi
For this part of modelling we will implement calculation of project NPV on deviation in each of the following inputs: production, oil price, capital costs and operational cost with range from -50% to 50%.
class PSAFinModelSens(PSAFinModel):
"""Class to deal with sensitivity
"""
def get_sensitivity(self):
"""Module to calculate NPV sensitivity to deviation of price,
production, capital and operating costs
"""
# store 'base' input data for further reference
if not hasattr(self, 'base'):
self.base = self.prod_cap_op_cost.copy(deep=True)
self.oil_price_base = self.p['Oil Price ($/bbl)']
# Create a dataframe to store the resulting data
res = pd.DataFrame(0., columns=['Production (1,000 bbl)',
'Oil Price ($/bbl)',
'Capital Costs ($m)',
'Operating Cost ($/1000 bbl)'],
index=[*range(-50,51,10)])
res.index.name = '%'
# Calculate the NPV for each deviation from the base value
for i in res.index:
k = i * 0.01 + 1
for c in res.columns:
if c == 'Oil Price ($/bbl)':
self.prod_cap_op_cost = self.base.copy()
self.p['Oil Price ($/bbl)'] = self.oil_price_base * k
else:
self.prod_cap_op_cost = self.base.copy()
self.p['Oil Price ($/bbl)'] = self.oil_price_base
self.prod_cap_op_cost[c] = \
self.prod_cap_op_cost[c].values * k
self.get_ncf()
res.at[i, c] = self.df['IOC DCF @ 12% (half-year)'].sum()
return res
# Creating an instance of the object
psa_sens = PSAFinModelSens(prod_cap_op_cost=input_estimates,
parameters_dict=psa_parameters_dict)
# Get the dataframe with sensitivity analysis results
sens = psa_sens.get_sensitivity()
# Plot the results of Sensitivity analysis
sens.index = sens.index.astype(str) + '%'
ax3 = sens.astype(int).plot(figsize=(12,7), alpha=0.8, grid=True,
title='IOC NPV Sensitivity ($m)', table=True)
# Some finer adjustment for the plot
ax3.get_xaxis().set_visible(False)
tb = ax3.tables[0]
tb.auto_set_font_size(False)
tb.set_fontsize(8)
# Create a copy of reference dictionary with input data
psa_parameters_dict_be = psa_parameters_dict.copy()
def get_npv(oil_price, discount_rate=12):
"""Function with price and discount rate input
to return NPV of IOC
"""
global input_estimates, psa_parameters_dict_be
psa_parameters_dict_be['Oil Price ($/bbl)'] = oil_price
psa_parameters_dict_be['Discount Rate'] = discount_rate / 100
psa = PSAFinModel(prod_cap_op_cost=input_estimates,
parameters_dict=psa_parameters_dict_be)
psa.get_ncf()
return (psa.df['IOC DCF @ 12% (half-year)'].sum())
# Create DataFrame to store NPV values for different discount rate and oil price
npv_profile = pd.DataFrame(0., index=np.linspace(5, 24, 20),
columns=[15, 20, 25])
for dr in npv_profile.index:
for op in npv_profile.columns:
npv_profile.at[dr, op] = get_npv(op, dr) * 1e-3
# Plot the resulting NPV Profile and spread
(hv.HLine(0).opts(line_width=1, color='black') * hv.Spread((npv_profile.index,
npv_profile[20],
(npv_profile[20]-npv_profile[15]), (npv_profile[25]-npv_profile[20])),
vdims=['y', 'yerrneg', 'yerrpos']).opts(fill_alpha=0.3)\
.opts(color=hv.Palette('Greens')) * \
hv.Curve((npv_profile.index, npv_profile[20])) \
.opts(color=hv.Palette('Greens'))).opts(
title='NPV Profile for oil prices 20 ± 5 ($/bbl)', width=800, height=400,
show_grid=True, ylabel='IOC NPV ($M)', xlabel='Discount Rate (%)')
This approach on the figure above shows us a way to estimate breakeven price, in the point where the green line crosses zero. So far, we will introduce the separate function for the calculation of breakeven oil price.
def get_breakeven(x):
"""Function with price input to return NPV squared,
which allows to find breakeven price minimizing NPV to zero
"""
return (get_npv(x[0]))**2 # the function from NPV Profile is used
# getting breakeven price
min_res = minimize(get_breakeven, 20, tol=1e-2)
breakeven_price = min_res.x[0]
print('IOC breakeven oil price: \t${:,.3f} $/bbl'.format(breakeven_price))
For now, we consider the Monte Carlo approach, which repeats calculation many times with the input data sampled from the predefined distributions. Let us chose truncated standard normal and triangle distributions (BetaPERT distribution, or any other, can be implemented similarly).
N = 1000000 # amount of samples from the distributions
# generating truncated normal distribution samples and scaling them
mc_truncnorm = truncnorm.rvs(-2., 2., size=N) * 0.25 + 1
# generating triangle distribution samples and scaling them
mc_triang = triang.rvs(.5, size=N) + 0.5
# creating DataFrame with samples and plotting it
df_mc = pd.DataFrame({'Truncated Normal': mc_truncnorm, 'Triangle': mc_triang})
df_mc.plot(kind='hist', bins=100, alpha=0.4);
We can see that both distributions quite similar. For more balanced result the truncated normal will be selected for further computations.
To proceed with quite a lot of paths we will need to implement multiprocessing calculations for speeding up. Let’s see where it goes.
%%time
N = 100000
# creating N Monte Carlo paths (deviations of initial input data)
mc = truncnorm.rvs(-2., 2., size=(N, 19, 3)) * 0.25 + 1
def get_npv_irr(mc):
"""Function to get NPV and IRR for each Monte Carlo path
"""
# getting the previously set input parameters
global input_estimates, psa_parameters_dict
# make a copy for further adjustment
psa_parameters_dict_mc = psa_parameters_dict.copy()
input_estimates_mc = input_estimates.copy()
# set the MC path values according to input 2D array
psa_parameters_dict_mc['Oil Price ($/bbl)'] *= mc[0, 0]
input_estimates_mc *= mc[:, :3]
psa = PSAFinModel(prod_cap_op_cost=input_estimates_mc,
parameters_dict=psa_parameters_dict_mc)
psa.get_ncf()
return [psa.df['IOC DCF @ 12% (half-year)'].sum(),
np.irr(psa.df['IOC Net Cash Flow ($m)'])*100]
# implementing multiprocessing to speed up calculations
if __name__ == '__main__':
with Pool() as p:
res = p.map(get_npv_irr, mc)
# storing the resulting NPV and IRR values into DataFrame for plotting
mc_res_ = pd.DataFrame(res, columns=['NPV ($m)', 'IRR (%)'])
ax = mc_res_.plot.density(subplots=True, sharex=False, sharey=False,
figsize=(15,8), grid=True, bw_method=3, yticks=[],
title=('IOC Probability Density Distribution for' +
' NPV and IRR (Monte Carlo Simulation)'));
ax[0].axhline(0, c='grey')
ax[1].axhline(0, c='grey')
plt.show()
Lastly, we will present an interactive dashboard HoloMap for presentation purposes. Current implementation will show preliminary calculated results, whereas DynamicMap could recalculate on the go thus presenting even more powerful options. We will create a menu to manually change project parameters that we previously modeled as well as some others.
def dashboard(rr, crl, split, cit, price, prod, capex):
"""A function to create visuals for plotting in the dashboard
"""
# inputs that we use before
global input_estimates, psa_parameters_dict
# making a copy not to change the base values
params = psa_parameters_dict.copy()
estimates = input_estimates.copy()
# adjusting inputs according to the value of a parameter in dashboard
params['Royalty Rate'] = rr / 100
params['Cost Recovery Limit'] = crl / 100
params['IOC Profit Oil Split'] = split / 100
params['Income Tax'] = cit / 100
params['Oil Price ($/bbl)'] = price
estimates['Production (1,000 bbl)'] *= (prod / 100)
estimates['Capital Costs ($m)'] *= (capex / 100)
# initiate object for calculations
psa = PSAFinModel(prod_cap_op_cost=estimates,
parameters_dict=params)
# calculate CF and DCF with get_ncf method
psa.get_ncf()
# calculate project kpi's
kpi = psa.get_kpi()
# create a holoviews table for the dashboard
table = hv.Table(kpi.reset_index()).options(width=700,
title='Project Performance')
# prepare data and create holoviews bar chart for the dashboard
df_fcf = psa.df[['IOC Net Cash Flow ($m)',
'State Net Cash Flow ($m)']].\
rename(columns={"IOC Net Cash Flow ($m)": "IOC Net Cash Flow",
"State Net Cash Flow ($m)": "State Net Cash Flow"}).\
div(1e3).reset_index().melt(id_vars=['Year'], var_name='Entity',
value_name='Value')
bars = hv.Bars(df_fcf, ['Year', 'Entity'], hv.Dimension('Value', unit='$M',
range=(-120, 300)))\
.options(color=hv.Cycle('Category20'), show_legend=True, stacked=True,
tools=['hover'], height=400, width=700,
legend_position='top_right', title_format='Project Cash Flow')
return (table + bars).cols(1)
# Creating a dictionary to contain visuals
dashboard_dict = {(rr, crl, split, cit, price, prod, capex):
dashboard(rr, crl, split, cit, price, prod, capex)
for rr in [5, 8, 10]
for crl in [50, 60]
for split in [40, 50, 60]
for cit in [15, 20, 30]
for price in [15, 20, 25]
for prod in [90, 100, 110]
for capex in [90, 100, 110]
}
# Drawing a Dashboard with scenarios
OutputSettings.options['max_frames'] = 2000
kdims=['Royalty Rate (%)', 'Cost Recovery Limit (%)',
'IOC/State Profit Oil Split (%)', 'Income Tax (%)', 'Oil Price ($/bbl)',
'Production (%)', 'Capital Cost (%)']
hv.HoloMap(dashboard_dict, kdims=kdims, sort=True).collate().opts(title='')
Upper four parameters (royalty rate, cost recovery limit, profit oil split and income tax) are characteristics of PSA terms and conditions. One could simply find that they are the elements of a zero sum game between IOC and State. The bottom three parameters (oil price, production level and capital costs) are mostly external, although production, and capital cost are more or less manageable by the IOC operator.
Some parts of this model may look like overcomplicated compared to excel. It’s probably a matter of dealing with new technology, which may just be different from the software that you already know.
Let us summarize macro-level benefits of building financial models with python:
Regarding the above PSA financial model specifically, it could be augmented by models to estimate reserves and production profile, as well as perform modeling capex and opex breakdown for the better analysis of alternative scenarios. Moreover, such models could be easily combined into portfolios for analysis on the portfolio level. In addition, frequency change from years to quarters (or months), introducing working capital and other improvements could easily be implemented.
However, the evident disadvantage of this approach is the lack of python programming skills among financial industry professionals.
For the convenience of the reader, this paper is also available as an HTML and Jupyter notebook at Github.
Please, feel free to comment or/and provide your feedback to the author.
Mikhail Shishlenin, 2/2/2020